Tabix: fast retrieval of sequence features from generic TAB-delimited files
Abstract
Tabix is the first generic tool that indexes position-sorted files in TAB-delimited formats such as GFF, BED, PSL, SAM and SQL export, and quickly retrieves features overlapping specified regions. Its features include few seek function calls per query, data compression with gzip compatibility and direct FTP/HTTP access. Tabix is implemented as a free command-line tool as well as a library in C, Java, Perl and Python. It is particularly useful for manually examining local genomic features on the command line, and it enables genome viewers to support huge data files and remote custom tracks over networks. Availability and implementation: http://samtools.sourceforge.net.
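To make the idea of "few seek calls per query" on a position-sorted file concrete, here is a minimal Python sketch. It is not Tabix's implementation: it is loosely modelled on a windowed (linear) index only, works on an uncompressed in-memory file, and omits the compression and any further index structure the real tool uses. The names `build_index`, `query`, `WINDOW` and the sample records are all hypothetical, and the window size is an assumption chosen for illustration.

```python
import io
from collections import defaultdict

WINDOW = 16384  # window size in bases; an assumption for this sketch


def build_index(handle):
    """Return chrom -> list whose entry i is the smallest byte offset of a
    record overlapping window i (None if no record overlaps that window).

    The input must be sorted by chromosome, then by start position, with
    BED-style columns: chrom, start (0-based), end (exclusive).
    """
    index = defaultdict(list)
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        fields = line.rstrip(b"\n").split(b"\t")
        chrom, start, end = fields[0].decode(), int(fields[1]), int(fields[2])
        windows = index[chrom]
        last = (end - 1) // WINDOW
        while len(windows) <= last:
            windows.append(None)
        for w in range(start // WINDOW, last + 1):
            if windows[w] is None:  # keep only the first (smallest) offset
                windows[w] = offset
    return index


def query(handle, index, chrom, qstart, qend):
    """Return the lines overlapping [qstart, qend) using a single seek."""
    windows = index.get(chrom, [])
    first = qstart // WINDOW
    # First indexed window at or after the one containing qstart.
    offset = next((o for o in windows[first:] if o is not None), None)
    if offset is None:
        return []
    handle.seek(offset)
    hits = []
    for line in iter(handle.readline, b""):
        fields = line.rstrip(b"\n").split(b"\t")
        if fields[0].decode() != chrom:
            break  # moved on to the next chromosome
        start, end = int(fields[1]), int(fields[2])
        if start >= qend:
            break  # sorted input: nothing further can overlap
        if end > qstart:
            hits.append(line.decode().rstrip("\n"))
    return hits


# Hypothetical position-sorted records, for illustration only.
DATA = (
    b"chr1\t100\t200\tgeneA\n"
    b"chr1\t150\t20000\tgeneB\n"
    b"chr1\t40000\t40100\tgeneC\n"
    b"chr2\t10\t50\tgeneD\n"
)
handle = io.BytesIO(DATA)
index = build_index(handle)

for hit in query(handle, index, "chr1", 17000, 18000):
    print(hit)
```

Because the records are sorted, one seek to the stored offset followed by a short forward scan is enough to collect every overlapping feature; the scan stops as soon as a record starts at or beyond the end of the queried region.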
Similar resources
CleanEx: new data extraction and merging tools based on MeSH term annotation
The CleanEx expression database (http://www.cleanex.isb-sib.ch) provides access to public gene expression data via unique gene names as well as via the biomedical characteristics of experiments. To achieve this, a dual annotation of both sequences and experiments has been generated. First, the system links official gene symbols to any kind of sequences used for gene expression measurements (cDNA, Affyme...
Pedro: a configurable data entry tool for XML
Pedro is a Java application that dynamically generates data entry forms for data models expressed in XML Schema, producing XML data files that validate against this schema. The software uses an intuitive tree-based navigation system, can supply context-sensitive help to users and features a sophisticated interface for populating data fields with terms from controlled vocabularies. Th...
The UCSC Genome Browser Database
The University of California Santa Cruz (UCSC) Genome Browser Database is an up-to-date source for genome sequence data integrated with a large collection of related annotations. The database is optimized to support fast interactive performance with the web-based UCSC Genome Browser, a tool built on top of the database for rapid visualization and querying of the data at many levels. The annotat...
Guitar Tab Mining, Analysis and Ranking
With over 4.5 million tablatures and chord sequences (collectively known as tabs), the web holds vast quantities of hand-annotated scores in non-standardised text files. These scores are typically error-prone and incomplete, and tab collections contain many duplicates, making retrieval of high-quality tabs difficult. Despite this, tabs are by far the most popular means of sharing musical instru...
iREAD: A Tool for Intron Retention Detection from RNA-seq data
Summary: Detecting intron retention (IR) events is emerging as a specialized need for RNA-seq data analysis. Here we present iREAD (intron REtention Analysis and Detector), a tool to detect IR events genome-wide from high-throughput RNA-seq data. The command line interface for iREAD is implemented in Python. iREAD takes as input an existing BAM file, representing the transcriptome, and a text f...
Journal:
- Bioinformatics
Volume 27, Issue 5
Pages -
Publication date: 2011